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Abstract 

This  article  presents  a  novel  Sparse  Constant  Propagation  technique  which  provides  a  heretofore 
unknown  level  of  practicality.  Unlike  other  techniques  which  arc  based  on  data  flow,  it  is  based  on  the 
execution-order  summarization  sweep  employed  in  Memory  Classification  Analysis  (MCA),  a  technique 
originally  developed  for  array  dependence  analysis.  This  methodology^  achieves  a  precise  description 
of  memory  reference  activity  within  a  summary  representation  that  grows  only  linearly  with  program 
size.  Because  of  this,  the  collected  sparse  constant  information  need  not  be  artificially  limited  to  satisfy 
classical  data  flow  lattice  requirements,  which  constrain  other  algorithms  to  discard  information  in  the 
interests  of  efficient  termination.  Sparse  Constant  Propagation  is  not  only  more  effective  within  the 
MCA  framework,  but  it  in  fact  generalizes  the  framework.  Original  MCA  provids  the  means  to  break 
only  simple  induction  and  reduction  types  of  flow-dependences.  The  integrated  framework  provides  the 
means  to  also  break  flow-dependences  for  which  array  values  can  be  propagated. 


1  Introduction 

The  propagation  of  sparse  constants  within  aggregate  data  structures  has  proven  significantly  more  difficult 
to  perform  practically  than  the  propagation  of  scalar  constants[WZ91].  In  fact,  the  only  work  focusing 
on  this  problem  to  date[SK98]  suggests  that  in  the  interests  of  efficiency  no  more  than  a  small  number  of 
constant  elements  be  collected  for  each  program  aggregate.  The  fundamental  difficulty  in  generalizing  scalar 
propagation  to  aggregate  propagation  has  been  that  the  classical  data  flow-based  abstract  interpretation, 
which  can  provide  only  a  bounded  number  of  constants  with  respect  to  the  program  size  in  the  scalar  case, 
can  produce  an  unbounded  number  of  constant  elements  within  an  aggregate. 

This  basic  difficulty  is  of  course  not  confined  merely  to  the  propagation  of  constant  values,  but  arises  when¬ 
ever  analysis  of  program  aggregates  is  required.  It  happens  that  the  summarization  of  the  memory  reference 
behavior  of  aggregates  has  been  an  extremely  active  area  of  research  within  the  automatic  parallelization 
community,  where  good  aggregate  summarization  techniques  have  been  held  to  hold  the  key  to  effective  in¬ 
terprocedural  dependence  analysis  for  parallelization.  Since  dependence  analysis  in  the  scalar  case  computes 
a  superset  of  the  information  required  for  constant  propagation,  as  constants  are  propagated  precisely  across 
flow  dependences,  it  turns  out  that  the  summarization  techniques  employed  for  general  dependence  analysis 
can  be  useful  in  generalized  constant  propagation.  We  will  also  see  that  the  dependence  analysis  is  itself 
considerably  enhanced  when  combined  with  this  propagation. 

This  paper  is  organized  as  follows.  First,  Section  2  provides  a  brief  overview  of  basic  dependence  analysis  for 
parallelization  and  reviews  aggregate  summarization  techniques,  focusing  on  the  promising  Linear  Memory 
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Access  Descriptor  (LMAD).  Section  3  describes  the  Memory  Classification  Analysis  (MCA)  build  around 
the  LMAD,  and  Section  4  describes  the  extensions  to  the  LMAD  and  to  MCA  which  would  support  con¬ 
stant  propagation  in  tandem  with  dependence  analysis.  Section  5  then  shows  how  the  integration  of  these 
techniques  leads  to  an  improved  MCA.  Section  6  considers  some  advanced  examples.  The  following  Sec¬ 
tion  7  details  limitations  and  possible  extensions  of  our  work.  Section  8  reviews  related  work  and  Section  9 
concludes. 


2  Array  Summarization  for  Dependence  Analysis 


An  important  motivation  for  dependence  analysis  is  program  parallelization[AK87],  where  it  is  desired  to 
shorten  the  execution  time  of  a  program  by  splitting  its  execution  across  a  number  of  processors.  This 
program  division  will  be  incorrect  if  it  does  not  respect  program  dependences.  There  are  three  important- 
types  of  dependences:  flow,  anti  and  output.  A  statement  B  is  said  to  be  flow- dependent  on  a  statement  A 
if  it  reads  a  variable  which  A  has  written,  anti-dependent  on  B  if  it  writes  a  variable  the  B  has  read,  and 
output- dependent  on  B  if  it  writes  a  variable  which  B  has  written.  The  latter  two  types  of  dependence  are 
memory-related  and  can  be  “broken”  by  transforming  the  program  to  perform  the  write  at  B  (and  subsequent- 
reads  of  the  value)  to  a  fresh  variable.  SSA  form[CFR+91]  efficiently  realizes  this  transformation. 

However,  the  ineffectiveness  of  SSA  form  in  the  context  of  program  arrays  has  resulted  in  the  adoption  of 
ad-hoc  methods  for  dependence  testing  of  array  references.  These  methods  provided  a  means  for  deciding 
the  existence  of  dependence  between  any  two  array  references  based  on  their  subscript-  expressions  and 
surrounding  loop  bounds.  This  decision  algorithm  would  then  be  applied  t-o  each  pair  of  possibly  aliased 
array  references  within  the  program  section  of  interest-,  typically  a  candidate  loop.  While  sufficient-  for  small 
programs,  t-he  high  asymptotic  cost  of  pairwise  comparisons  has  limited  its  application  t-o  larger  codes. 

To  obtaining  pragmatic  implementations,  various  access  summarization  techniques  have  been  studied,  and 
a  number  of  different,  paradigms  have  gained  popularity.  Linear  constraint  techniques  [TIF86]  are  based  on 
finding  a  Fourier-Mot-zkin  [Wil76]  solution  t-o  a  system  of  linear  inequalities  which  characterize  the  conditions 
under  which  a  pair  of  array  accesses  can  overlap.  Although  t-he  Fourier-Mot-zkin  procedure  is  expensive  and 
produces  real  solutions  rather  than  only  t-he  relevant  integer  ones,  t-he  Omega  Test.[Pug91]  has  extended 
it-  t-o  alleviate  these  complications.  However,  precise  linear  constraints  cannot  completely  describe  accesses 
involving  non-affine  conditionals  or  when  t-he  overall  shape  of  mi  access  does  not.  form  a  convex  hull.  Because 
t-he  latter  can  be  produced  by  an  intersection  operation  between  two  otherwise  convex  regions,  frequent- 
approximation  is  inevitable. 

Triplet-  notation [HK91],  in  which  each  access  region  is  characterized  by  an  upper  bound,  lower  bound  and 
stride  in  each  dimension,  has  also  gained  a  number  of  proponents.  In  Gu,  et.  al’s[GLL97]  interpretation  of 
this  scheme,  every  access  contains  a  guard  specifying  t-he  conditions  under  which  it.  occurs.  Approximation 
is  avoided  by  maintaining  high-level  representations  of  operations  which  cannot  be  performed  precisely.  For 
example,  a  difference  list  of  regions  is  associated  with  a  region  when  not  enough  is  known  t-o  perform  a  required 
difference  operation.  However,  complications  may  arise  in  the  presence  of  Fortran-style  array  reshapes,  for 
which  no  general  method  for  relating  accesses  interprocedurally  has  been  devised.  Furthermore,  this  scheme 
relies  on  the  dubious  assumption  that  all  array  access  occur  within  declared  bounds. 

Recently,  Paek,  et-  al.[PHP98]  have  proposed  a  representation  based  on  t-he  Linear  Memory  Access  Descriptor 
(LMAD).  The  LMAD  achieves  maximum  precision  by  describing  accesses  uniformly  as  a  (recursive)  sequence 
of  equidistant  element  references.  An  arbitrarily  complex  multidimensional  traversal  can  be  uniquely  repre¬ 
sented  (up  t-o  LMAD  equivalence)  with  only  a  base  offset,  and  a  series  of  stride/span  pairs  (access  dimensions). 
The  size  of  each  element  completes  t-he  exact-  memory  map  for  t-he  access. 

LMAD  construction  for  an  array  access  starts  with  t-he  formation  of  a  linearized  access  expression  which  is  the 
sum  of  t-he  products  of  each  dimension’s  access  expression  and  the  size  of  each  element-  of  that  dimension, 
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REAL  A (14,*) ; 


DO  1=1,2 

(jZ,  14,  26 
\^Q,  14.  26 

-  26) 

DO  J=1 , 2 

(Ah  J1  +  267  - 

-  26) 

DO  K=1 ,10,3 

+  267  +  14,7  - 

-  26) 

...  A(J,K+26*(I-1))  .. 

.  (A  +  267  +  14,7  +  K  - 

-  26) 

END  DO 
END  DO 
END  DO 


Figure  1:  Hoeflingers  Example  Showing  Expansion  Through  Indices 


and  in  which  each  index  variable  has  been  normalized.  This  symbolic  expression  is  repeatedly  expanded 
through  successive  index  variables,  starting  with  the  innermost.  As  each  index  variable  is  substituted  out, 
a  stride/span  pair  is  inserted  into  the  LMAD.  The  stride  is  set  to  the  increase  in  base  expression  due  to  the 
increment  by  one  of  the  normalized  index  variable,  and  the  span  is  set  to  the  difference  between  the  base  offset 
expression  with  the  normalized  index  variable  set  to  its  minimum  and  maximum  values.  Figure  1  presents  a 
corrected  version  of  Hoeflingers  LMAD  construction  example[Hoe98]  detailing  the  expansion  through  each 
index  variable. 

Conditional  statements  are  taken  into  account  by  associating  each  LMAD  with  an  execution  predicate. 
Because  the  LMAD  uses  array  declarations  in  the  same  manner  they  are  used  in  a  compiler,  namely,  to 
calculate  actual  element  offsets,  it  avoids  the  complexity  of  continually  relating  subscript  expressions  to 
array  bounds  expressions.  By  explicitly  linearizing  array  accesses,  the  LMAD  offers  a  level  of  accuracy 
unmatched  by  the  other  representations:  it  directly  supports  programmer  linearization  of  array  references, 
traditionally  the  bane  of  dependence  analysis,  and  offers  full  Fortran-style  array  reshape  transparency  while 
avoiding  the  in-bounds  assumption. 


3  Memory  Classification  Analysis 

Memory  Classification  Analysis  is  an  execution-order  sweep  through  the  program,  which  is  assumed  to  be 
comprised  of  a  series  of  nested  contexts.  Summary  sets  which  describe  the  memory  activity  for  each  of  these 
contexts  are  generated  in  a  manner  in  which  a  single  representative  dependence  of  a  particular  type  is  present 
between  the  summary  sets  of  two  different  contexts  exactly  when  there  is  an  actual  dependence  between 
the  contexts.  There  are  three  summary  sets  for  each  context,  or  grain:  Read-Only  (EO),  Read/Write  (RW) 
find  Write-First  (WF).  Together,  they  contain  the  collection  of  LMADs  describing  rill  memory  activity.  All 
dependences  between  two  grains  A  and  D  are  then  precisely  characterized  by  following  equations: 

Flow:  {A no  n  Bhw)  U  {AWF  D  BHO)  U  {AWF  D  BRW)  U  (Arw  HBro)  U  (ArW  D  Bmv)  <>  0 

Anti  /Output:  (A  rq  n  B\yf)  U  (A\yf  H  Civ7  f  )  U  (Awr  H  B\vf)  <>  0 

Flow-dependences,  unless  of  a  simple  inductive  or  reductive  nature,  prevent  parallelization  across  dependence 
grains,  while  anti-  and  output-dependences  can  be  removed  using  variable  privatization. 

The  Access  Region  Test[HPP96]  uses  this  classification  scheme  to  check  for  dependences  between  loop  it¬ 
erations.  First  summary  sets  are  formed  for  a  single  iteration  of  the  loop.  Then  these  sets  are  expanded 
through  the  loop  index  variable.  One  check  is  made  for  internal  overlap  in  any  of  the  descriptors  contained 
in  one  of  the  two  write  sets.  Another  check  is  made  for  intersection  among  any  of  the  three  summary  sets. 
The  final  check  for  intersection  between  descriptors  within  the  same  write  set  completes  the  cross-iteration 
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dependence  test. 

The  MCA  framework  also  incorporates  conditional  array  access,  by  attaching  a  (true  by  default)  predicate 
to  each  Access  Region  Descriptor  (ARD).  The  framework  is  kept  efficient  by  the  aggressive  coalescing  of 
related  ARDs.  For  example,  a  series  of  writes  to  successive  array  indices  under  the  similar  conditions 
will  be  incorporated  into  the  appropriate  write  set  as  only  a  single  ARD.  Aggressive  simplification  keeps 
the  analysis  efficient,  as  the  analysis  cost  rises  with  the  size  of  each  summary  set.  To  maintain  complete 
precision,  however,  no  coalescing  is  performed  in  the  case  where  attached  predicates  are  dissimilax. 


4  Sparse  Constant  Propagation  Extensions 


The  essential  observation  is  that  the  Memory  Classification  Analysis  can  perform  constant  propagation 
almost  for  free.  This  may  be  accomplished  by  adding  a  “current  value”  field  to  each  LMAD.  Of  course,  this 
field  will  only  be  set  for  LMADs  in  the  write  sets.  This  field  is  then  used  much  as  any  attached  predicate  as 
a  discriminating  factor  for  future  coalescing  operations. 

In  Figure  2  we  show  several  styles  of  initializations  with  which  our  enhanced  MCA  must  deal.  The  example 
in  Figure  2(a)  is  taken  from  the  regex  library  distributed  with  the  GNU  FSF  version  of  grep[Foub].  Since 
none  of  the  accessed  array  sections  overlap,  no  special  action  over  Paek;s  MCA  is  taken,  other  than  the 
integration  of  the  constant  initializer  value  as  part  of  the  LMAD. 

Figure  2(b),  taken  from  the  Huffman  code  inflater  of  the  GNU  FSF  version  of  gzip[Fouc],  is  slightly  more 
complex.  In  this  case,  the  original  MCA  would  actually  coalesce  the  LMADs  produced  by  the  series  of 
assignments  together.  In  our  case,  we  respect  the  differing  constant  values  by  disallowing  the  coalescing 
operation  unless  the  assigned  values  also  match. 

The  third  example  of  Figure  2(c)  is  taken  from  the  utility  library  of  zsh[Tea].  In  this  case,  the  Write-First 
set  generated  from  the  first  loop  is  {  typtab256  +  0  },  from  the  second  loop  {  typtabj2  +0,  typtabj2  +  128  }, 
find  from  the  last  statement  {  typtab  +  127  }.  Because  the  latter  two  Write-First,  sets  are  subsets  of  the  first, 
ordinary  MCA  would  ignore  them  and  use  the  first  set  as  the  overall  program  segment  summary.  However, 
in  using  the  assigned  constant  value  in  the  classification,  we  must  split  up  the  monolithic  LMAD  into  its 
uniform  constituent  parts.  The  overall  Write-First  summary  set  which  respects  constant  assignments  is: 
{  [typtab+  +  0]i«9,  [typtab+  +  32]0,  [typtab+  +  127]i<<9,  [typtab+  +  160]o  },  where  each  LMAD  is 
shown  with  its  initial  constant  value  as  an  outside  subscript. 


for  (  c  =  !a!;  c  <=  ’z!;  C++  ) 
re_ayntax_table[c]  =  1; 
for  (  C.  =  !A!:  c  <=  !Z!:  C++  ) 
rc_8yntax_tablc[c]  =  1; 
for  (  c  =  !Q!;  c  <=  !9!;  C++  ) 
re  ^syntax + ab  1  c  [c]  =  1; 
rc-Syntax_tablc[!-!]  =  1; 

(a)  Syntax  Table  Initialization 


for  (  i  =  0:  i  <  144:  i++  ) 
l[i]  =  8; 

for  (  :  i  <  256:  i++  ) 
l[i]  =  9; 

for  (  :  i  <  280:  i++  ) 

l[i]  =  7; 

for  (  ;  i  <  288;  i++  ) 

1  [i]  =  8; 

(b)  Huffman  Code  Initialization 


for  (  to  =  0;  to  !=  256:  t0++  ) 
typtabftO]  =  0: 

for  (  tO  =  0;  tO  !=  32;  t0++  ) 
typtabftO]  = 

typtab[t0+128]  =  1<<9; 
typtab[127]  =  1<<9; 


(e)  Type  Table  Initialization 


Figure  2:  Array  Initialization  Examples 
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4.1  Expression  Initializers 


Although  the  above  approach  works  well  for  constant  initializers,  in  practice  many  useful  initialization 
expressions  are  written  in  terms  of  index  and  other  program  variables.  Propagation  of  these  values  requires 
us  to  store  full  expressions  as  initializers.  These  expressions  can  then  be  evaluated  at  selected  values  of  the 
index  variable  using  a  sophisticated  symbolic  evaluation  subsystem,  such  as  that  described  by  Gerlek,  et 
al.[GSW95]  for  classifying  induction  sequences. 

Furthermore,  incorporating  this  capability  permits  us  to  retain  the  original  structure  of  the  analysis.  Instead 
of  keeping  array  sections  with  different  initializers  as  separate  LMADs,  we  can  coalesce  as  usual  and  maintain 
precise  initialization  information  using  a  complex  initialization  expression. 

For  example,  the  preceding  summary  Write-First  set  could  also  be  simplified  to: 

{  [typtabo56  +  0]0<=j&&;j<32||12r<=i&&i<160?(l<<9):0  } 

where  the  variable  i  is  the  offset  from  the  base  of  the  descriptor.  Although  somewhat  more  complex,  this 
scheme  is  preferable  because  it  provides  a  uniform  way  of  dealing  with  differing  neighbor  constant  initializers 
and  general  expression  initializations. 

It  is  also  worth  noting  that  significant  splintering  of  the  LMADs,  which  is  the  inevitable  alternative,  can 
impart  the  performance  of  the  MCA,  which  depends  on  maintaining  summary  sets  with  as  few  elements  as 
possible  to  keep  operations  between  rind  within  sets  efficient. 


5  A  Better  Memory  Classification  Analysis 

As  described  in  Section  3,  ordinary  Memory  Classification  Analysis  can  only  break  flow-dependences  of  an 
inductive  or  reductive  nature.  However,  our  Sparse  Constant  Propagation  enhancements  provide  the  tools 
to  break  certain  ordinary  flow  dependences  as  well.  Figure  3  shows  how  a  typical  array  dependence  taken 
from  the  dif  f  3  program  of  the  GNU  FSF  dif  futils  distribution[Foua]  is  broken. 

Figure  3(a)  shows  an  array  mapping  which  is  initialized  in  the  first  if-statement  grain,  and  used  in  the  second 
loop  grain.  These  two  grains  would  be  inseparable  by  Memory  Classification  Analysis  alone,  but  in  tandem 
with  Sparse  Constant  Propagation,  the  flow  dependence  can  be  broken  by  transforming  the  code  to  that- 
shown  in  Figure  3(b). 

This  is  performed  as  follows.  The  Write-First  set  for  the  earlier  grain  contains  the  LMAD 

{  [mappingl  +  $\i==WQiatrcmv{file[‘2)  ”-»)==0?(i==l?2:l):(i==l?l:2)  } 

for  mapping.  When  this  value  is  then  used  in  the  second  grain,  we  can  use  instead  the  closed  form  expression 
as  a  function  of  the  index  variable.  This  transformation  removes  the  LMAD  for  mapping  from  the  Read-Only 
summary  set  of  the  later  grain,  rendering  the  two  grains  parallelizeable. 

This  transformation  is  a  generalization  of  induction  variable  substitution.  In  induction  variable  substitution, 
a  closed  form  expression  for  fin  induction  variable  is  generated  in  terms  of  the  index  variable  to  break  a 
flow  dependence.  This  transformation  is  more  general  in  that  the  closed  form  expression  is  generated  from 
a  sequence  of  arbitrary  array  assignments. 


6  More  Complex  Examples 

In  Figure  4(a)-(c)  we  present  more  complex  cases.  In  all  three  cases,  the  Write-First  summary  set  generated 
by  the  first  loop  is:  {  [c\  +  0]i  }.  In  4(a),  the  second  loop  contains  a  Write-First  reference  to  c  and  to  a. 


o 


if  (strcmp  (file[2],  ==  0) 

{ 

mapping  [0]  =  0; 
mapping  [1]  =  2; 
mapping  [2]  =  1; 

} 

else 

{ 

mapping  [0]  =  0; 
mapping  [1]  =  1; 
mapping  [2]  =  2; 

} 

for  (i  =  0;  i  <  3;  i++) 

r  ev  snapping [mapping  [i]]  =  i; 


(a)  Original  Code 


if  (strcmp  (file  [2],  ==  0) 

{ 

mapping [0]  =  0; 
mapping  [1]  =  2; 
mapping [2]  =  1; 

} 

else 

{ 

mapping [0]  =  0; 
mapping[l]  =  1; 
mapping  [2]  =  2; 

} 

for  (i  =  0;  i  <  3;  i++) 

rev_mapping[i==0  ?  0  : 

strcrap(f ile  [2] , "-")==0  ? 
(i==l?2 : 1)  :  (i==l?l:2)]  =  i; 

(b)  Flow- Dependence  Free  Code 


Figure  3:  Broken  Flow  Dependence 


These  start  out  as  {  [c+i] 2,  [a  +  i] 2  },  where  the  initial  value  for  a  is  simply  copied  from  the  constant,  value 
assigned  to  c.  Expansion  through  the  loop  variable  results  in  {  [cj  +  0]2,  [aj  +  0]2  }.  There  is  no  internal 
overlap  within  descriptors,  no  overlap  between  descriptors  of  the  same  set,  mid  no  intersection  between  sets, 
so  no  further  analysis  is  needed. 

In  case  4(b),  the  value  written  to  each  element,  of  a  comes  from  array  c,  but  until  we  generate  a  summary 
of  the  behavior  of  the  entire  loop  we  cannot  be  sure  that  the  value  is  indeed  a  constant.  Therefore,  the 
Write-First  set  for  the  first  statement  in  the  second  loop  starts  out  as  {  [a  +  *]c[jj  }.  The  Read-Only  set  for 
the  statement  is  {  [c  +  i]  }.  The  Write-First  set  for  the  second  statement,  of  the  second  loop  is  {  [c  +  i]2  }, 
which  in  tandem  with  the  identical  Read-Only  set,  gets  converted  to  a  Read/ Write  set.  When  we  expand  the 
LMADs  through  the  i,  we  can  also  change  the  c[i]  initializer  of  the  LMAD  for  a  to  1,  because  it  is  now 
known  that  no  other  index  of  c  was  written  to  during  the  course  of  the  loop,  and  every  value  written  to  a 
from  c  must  have  come  from  outside  of  the  loop. 

Case  4(c)  is  complicated  because  expansion  of  the  LMADs  through  the  second  loop  index  and  subsequent 


for  (  i  =  0;  i  <  5;  i++  ) 
c  [i]  =  1; 

for  (  i  =  0;  i  <  5;  i++  ) 

{ 

c[i]  =2; 
a  [i]  =  c  [i]  ; 


(a)  Write  Before  Read 


for  (  i  =  0;  i  <  5;  i++  ) 
c[i]  =  1; 

for  (  i  =  0;  i  <  5;  i++  ) 

{ 

a[i]  =  c [i]  ; 
c[i]  =  2; 


(b)  Write  After  Read 


Figure  4:  More  Complex  Cases 


for  (  i  =  0;  i  <  5;  i++  ) 
c[i]  =  1; 

for  (  i  =  0;  i  <  5;  i++  ) 

{ 

a[i]  =  c [i]  ; 
c  [4-i]  =  1; 


(e)  Writes  and  Reads  Intertwined 
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application  of  the  Access  Region  Test  reveals  overlap  between  the  Read-Only  and  Write-First  summary 
sets  which  contain  overlapping  references  to  array  c.  In  this  and  similar  complex  cases,  a  good  deal  of 
sophistication  will  be  required  to  precisely  characterize  the  sequence  of  values  assigned  from  c.  Therefore, 
it  seems  appropriate  to  fall  back  on  more  general  and  expensive  element-level  methods  such  as  [SK98]. 


7  Limitations  and  Extensions 

The  Memory  Classification  analysis  described  by  Hoeflinger[Hoe98]  does  not  support  recursive  cycles  in  the 
call  graph.  Extension  of  MCA  by  iteration  over  these  cycles  would  make  the  method  more  general. 

Although  our  discussion  has  been  couched  only  in  terms  of  propagation  of  constants  and  expressions,  this 
technique  also  has  very  important  implications  for  aggregate  Shape  Analysis.  Shape  Analysis[SRW96]  com¬ 
putes  a  conservative  description  of  the  structure  of  program  heap  space  produced  by  a  series  of  pointer 
operations.  What  has  not  been  yet  been  shown  is  how  to  handle  shape  analysis  over  an  array  of  pointers. 
Using  the  present  constant  summarization  technique,  and  allowing  Static  Shape  Graphs [SRW96]  for  constant 
“values”,  we  can  obtain  a  precise  and  concise  description  even  of  arrays  of  heap  objects.  This  line  of  research 
has  been  pursued  in  a  companion  paper [Sch99]. 


8  Related  Work 


The  first  practical  algorithm  for  Constant  Propagation  based  on  SSA  form  was  by  Wegman  and 
Zadeck[WZ91j.  Extending  this  for  Sparse  Constant  Propagation  could  be  accomplished  only  inefficiently  by 
the  treatment  of  entire  aggregates  as  scalars  and  the  introduction  of  special  Access  and  Update  functions 
as  suggested  by  Cytron,  et.  al[CFR+91]. 

Choi,  et.  al[CCF94]  showed  that  introducing  these  functions  not  only  increases  the  volume  of  imprecise 
information  in  the  analysis,  but  actually  obscures  the  data-flow  by  replacing  all  def-def  chains  with  use-def 
chains  even  when  this  is  semantically  inaccurate.  Instead,  they  advocated  maintaining  def-def  chains,  but 
in  a  factored  form  which  would  keep  the  analysis  manageable. 

Steensgaard[Ste95]  improved  on  this  factorization  through  the  use  of  storage  fragmentation.  He  showed  that 
by  classifying  areas  of  storage  as  being  the  referents  of  only  a  limited  number  of  program  pointers,  further 
reduction  of  the  SSA  into  assignment  factored  form  was  possible.  He  also  demonstrated  that  finer  levels  of 
fragmentation  could  be  obtained  by  more  elaborate  pointer  analysis. 

Knobe  and  Sarkar’s  recent  introduction  of  Array  SSA  form[KS98]  has  provided  a  new  unified  framework  for 
reasoning  about  program  aggregates,  offering  precise  data  flow  information  on  a  per-element  basis.  However, 
this  is  also  its  biggest  weakness.  The  Array  SSA-based  Sparse  Constant  Propagation[SK98]  algorithm  they 
propose  cannot  deal  effectively  with  arrays  of  unbounded  size.  Indeed,  as  presented  in  their  paper,  Array  SSA 
form  relies  on  the  allocation  of  additional  space  of  the  same  magnitude  as  the  original  arrays,  something  which 
may  not  be  known  until  run-time.  True,  the  Array  SSA  technique  lends  itself  to  a  run-time  implementation, 
but  this  is  irrelevant  for  constant  propagation,  which  is  a  fundamentally  static  analysis. 

The  present  work  can  be  seen  as  extending  the  assignment  factored  SSA  form  of  Steensgaard  to  fragment  not 
only  program  aggregates  from  one  another,  but  to  logically  fragment  particular  aggregates  into  constituent 
pieces  according  to  the  manner  in  which  they  are  used  in  the  program.  The  original  MCA  merely  classifies 
memory  accesses  by  type  of  access;  this  work  introduces  the  finer  classification  within  write  accesses  of 
the  value  written,  producing  an  effective  Sparse  Constant  Propagation  algorithm  which  operates  within  the 
framework  of  MCA. 
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9  Conclusion 


Pack,  et  al.[PHP98]  expressed  the  belief  that  simplification  of  array  access  patterns  using  the  notions  of 
strides  and  spans  would  benefit  all  array-related  analyses.  This  study  has  shown  how  this  technique  may  be 
profitably  applied  to  the  long-standing  Sparse  Constant  Propagation  problem.  A  number  of  other  solutions 
have  been  proposed,  but  because  they  all  operate  fundamentally  at  the  element  level,  they  cannot  deal 
effectively  with  the  potentially  unlimited  volume  of  information  which  can  be  generated. 

Hoeflinger[Hoe98]  proposes  an  elaborate  interprocedural  analysis  framework  built  around  a  powerful  Region 
processor.  Benefits  of  this  would  be  both  conceptual  and  practical.  On  the  one  hand,  this  architecture  would 
neatly  sidestep  the  age-old  analysis  phase-ordering  dilemma.  On  the  other,  the  entire  system  would  benefit 
uniformly  from  incremental  improvements  to  this  central  processor.  This  phenomenon  has  been  amply 
demonstrated  in  this  work,  where  enhancing  the  Region  processor  for  Sparse  Constant  Propagation  has 
produced  a  more  powerful  Memory  Classification  Analysis  capable  of  breaking  array-based  flow  dependences. 
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